Performance improvement of a bitstream-based front-end for wireless speech recognition in adverse environments

نویسندگان

  • Hong Kook Kim
  • Richard V. Cox
  • Richard C. Rose
چکیده

In this paper, we propose a feature enhancement algorithm for wireless speech recognition in adverse acoustic environments. A speech recognition system is realized at the network side of a wireless communications system and feature parameters are extracted directly from the bitstream of the speech coder employed in the system, where the feature parameters are composed of spectral envelope information and coder-specific information. The coder-specific information is apt to be affected by environmental noise because the speech coder fails to generate high quality speech in noisy environments. We first found that enhancing noisy speech prior to speech coding improves the recognizer’s performance. However, our aim was to develop a robust front-end operating at the network side of a wireless communications system without regard to whether speech enhancement was applied at the sender side. We investigated the effect of a speech enhancement algorithm on the bitstream-based feature parameters. Consequently, a feature enhancement algorithm is proposed which incorporates feature parameters obtained from the decoded speech and a noise suppressed version of the decoded speech. The coder-specific information can also be improved by re-estimating the codebook gains and residual energy from the enhanced residual signal. HMM-based connected digit recognition experiments show that the proposed feature enhancement algorithm significantly improves recognition performance at low signal-to-noise ratio (SNR) without causing poorer performance at high SNR. From large vocabulary speech recognition experiments with far-field microphone speech signals recorded in an office environment, we show that the feature enhancement algorithm greatly improves word recognition accuracy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Feature enhancement for a bitstream-based front-end in wireless speech recognition

In this paper, we propose a feature enhancement algorithm for wireless speech recognition in adverse acoustic environments. A speech recognition system is realized at the receiver side of a wireless communications system and feature parameters are extracted directly from the bitstream of the speech coder employed in the system. The feature parameters are composed of spectral envelope and coder-...

متن کامل

A bitstream-based front-end for wireless speech recognition on IS-136 communications system

In this paper, we propose a feature extraction method for a speech recognizer that operates in digital communication networks. The feature parameters are basically extracted by converting the quantized spectral information of a speech coder into a cepstrum. We also include the voiced/unvoiced information obtained from the bitstream of the speech coder in the recognition feature set. We performe...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • IEEE Trans. Speech and Audio Processing

دوره 10  شماره 

صفحات  -

تاریخ انتشار 2002